Importance results are shown as boxplots, split by feature importance (FI) method and learner type, separately for each dataset or data-generating process (DGP). Table 1 lists all DGPs used in the importance benchmark.

Table 1: DGPs used for comparison, alongside their number of features (p) and a brief description. Unless otherwise specified, features in simulated data were drawn from a univariate normal distribution and do not have an effect on the target.

| Dataset | p | Description | Model |
|---|---|---|---|
| ewald | 5 | DGP used by Ewald et al. (2024) | \(Y = X_4 + X_5 + X_4 X_5 + \varepsilon\) |
| correlated | 4 | \(X_1\) and \(X_2\) correlated (\(r = 0.25, 0.75\)) | \(Y = 2 X_1 + X_3 + \varepsilon\) |
| interactions | 5 | \(X_1\) and \(X_2\) interact but have no direct effects | \(Y = 2 X_1 X_2 + X_3 + \varepsilon\) |
| friedman1 | 10 | Regression benchmark from mlbench | \(Y = 10 \sin(\pi X_1 X_2) + 20 (X_3 - 0.5)^2 + 10 X_4 + 5 X_5 + \varepsilon\) |
| independent | 5 | Uncorrelated features with direct effects | \(Y = 2 X_1 + X_2 + 0.5 X_3 + \varepsilon\) |
| confounded | 4 | Unobserved confounder \(H\) with proxy | \(Y = H + X_1 + \varepsilon\) |
| mediated | 4 | Mediator masks exposure variable | \(Y = 1.5 \cdot \mathrm{mediator} + 0.5 \cdot \mathrm{direct} + \varepsilon\) |
| bike sharing | 12 | Real-world dataset | N/A |
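To make the table concrete, the following is a minimal sketch of how the correlated DGP could be simulated. It is not the benchmark's own code. The function names `simulate_correlated` and `pearson` are hypothetical, and the sketch assumes standard normal features (mean 0, variance 1) and a standard normal error term, neither of which is stated in the table.

```python
import math
import random


def simulate_correlated(n, r, seed=0):
    """Sketch of the 'correlated' DGP: returns (rows of [x1, x2, x3, x4], targets)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        # Mixing x1 with independent noise yields corr(x1, x2) = r at unit variance.
        x2 = r * x1 + math.sqrt(1 - r ** 2) * rng.gauss(0, 1)
        x3 = rng.gauss(0, 1)
        x4 = rng.gauss(0, 1)   # pure noise feature, no effect on the target
        eps = rng.gauss(0, 1)  # assumption: standard normal error term
        X.append([x1, x2, x3, x4])
        y.append(2 * x1 + x3 + eps)  # Y = 2 X_1 + X_3 + eps, as in Table 1
    return X, y


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


X, y = simulate_correlated(n=5000, r=0.75)
x2 = [row[1] for row in X]
print("corr(X2, Y):", round(pearson(x2, y), 2))
```

Under these assumptions, \(X_2\) has no term in the model but is still marginally associated with \(Y\) through \(X_1\), which is the situation this DGP is meant to expose.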

Importance (scaled)

Importances are scaled to percentages such that 100 is the highest importance value assigned by the corresponding method on a dataset in a given replication.
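The scaling rule can be sketched as follows. This is an illustration only, not the code used to produce the figures. The record layout and the field names (`dataset`, `method`, `replication`, `feature`, `importance`) are hypothetical, and the handling of groups whose maximum is not positive is an assumption.

```python
def scale_to_percent(records, group_keys=("dataset", "method", "replication")):
    """Add a 'scaled' field so that the top feature in each group scores 100."""
    # First pass: find the largest raw importance within each group.
    group_max = {}
    for rec in records:
        key = tuple(rec[k] for k in group_keys)
        group_max[key] = max(group_max.get(key, float("-inf")), rec["importance"])

    # Second pass: express every importance as a percentage of its group maximum.
    scaled = []
    for rec in records:
        top = group_max[tuple(rec[k] for k in group_keys)]
        # Assumption: groups whose maximum is not positive are left undefined (NaN).
        pct = 100 * rec["importance"] / top if top > 0 else float("nan")
        scaled.append({**rec, "scaled": pct})
    return scaled


example = [
    {"dataset": "independent", "method": "PFI", "replication": 1, "feature": "x1", "importance": 2.0},
    {"dataset": "independent", "method": "PFI", "replication": 1, "feature": "x2", "importance": 1.0},
    {"dataset": "independent", "method": "PFI", "replication": 1, "feature": "x4", "importance": -0.5},
]
for rec in scale_to_percent(example):
    print(rec["feature"], rec["scaled"])
```

Negative raw importances stay negative after scaling, so scaled values are bounded above by 100 but not below by 0. If results are also split by learner, a learner field could be added to `group_keys`.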

Confounded

Scaled feature importance scores for the Confounded task by FI method and learner, colored by implementing package.

Correlated

Scaled feature importance scores for the Correlated (r = 0.25) task by FI method and learner, colored by implementing package.

Scaled feature importance scores for the Correlated (r = 0.75) task by FI method and learner, colored by implementing package.

Ewald

Scaled feature importance scores for the Ewald task by FI method and learner, colored by implementing package.

Friedman1

Scaled feature importance scores for the Friedman1 task by FI method and learner, colored by implementing package.

Independent

Scaled feature importance scores for the Independent task by FI method and learner, colored by implementing package.

Interactions

Scaled feature importance scores for the Interactions task by FI method and learner, colored by implementing package.

Mediated

Scaled feature importance scores for the mediated task by FI method and learner, colored by implementing package.

Bike sharing

Scaled feature importance scores for the bike sharing task by FI method and learner, colored by implementing package.

Importance (ranks)

Importance scores are converted to ranks, with 1 (leftmost) being the highest importance score, i.e., the most important feature as judged by the respective method. Boxplots of the ranks illustrate how consistent the rankings are.
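The rank conversion can be sketched as below. This is an illustration rather than the benchmark's implementation. The function name `rank_importances` is hypothetical, and giving tied scores their average rank is an assumption, since the text does not say how ties are handled.

```python
def rank_importances(scores):
    """Map {feature: importance} to {feature: rank}, where rank 1 is the highest score."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j over a run of features sharing the same score.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        # Assumption: tied features share the average of the positions they span.
        avg_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = avg_rank
        i = j + 1
    return ranks


print(rank_importances({"x1": 0.9, "x2": 0.1, "x3": 0.4}))
```

Applying this once per method, learner and replication gives one rank per feature and run, which is the quantity the boxplots summarize.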

Confounded

Ranked feature importance scores for the Confounded task by FI method and learner, colored by implementing package.

Correlated

Ranked feature importance scores for the Correlated (r = 0.25) task by FI method and learner, colored by implementing package.

Ranked feature importance scores for the Correlated (r = 0.75) task by FI method and learner, colored by implementing package.

Ewald

Ranked feature importance scores for the Ewald task by FI method and learner, colored by implementing package.

Friedman1

Ranked feature importance scores for the Friedman1 task by FI method and learner, colored by implementing package.

Independent

Ranked feature importance scores for the Independent task by FI method and learner, colored by implementing package.

Interactions

Ranked feature importance scores for the Interactions task by FI method and learner, colored by implementing package.

Mediated

Ranked feature importance scores for the mediated task by FI method and learner, colored by implementing package.

Bike sharing

Ranked feature importance scores for the bike sharing task by FI method and learner, colored by implementing package.